Abstract

We determine under which conditions certain natural models of random constraint 
satisfaction problems have sharp thresholds of satisfiability. These models include graph 
and hypergraph homomorphism, the (d, k, t)-model, and binary constraint satisfaction
problems with domain size 3. 



1 Introduction 



Random 3-SAT and its generalizations have been studied intensively for the past decade or
so (see e.g. [1, 5, 8, 17, 25, 2, 7, 13, 15, 34, 9]). One of the most interesting things about
these models, and arguably the main reason that most people study them, is that many of
them exhibit what is called a sharp threshold of satisfiability, a critical clause-density at which
the random problem suddenly moves from being almost surely satisfiable to almost surely
unsatisfiable (these terms are defined formally below). Most of the work on these problems is,
at least implicitly, an attempt to determine
the precise locations of their thresholds. At this point, these locations are known only for a
handful of the problems, such as [2, 7, 13, 15, 34, 9]. Just proving the existence of a sharp
threshold for random 3-SAT was considered a major breakthrough by Friedgut [17]. The vast
majority of these generalizations appear to have sharp thresholds, but there are exceptions
which are said to have coarse thresholds.

The ultimate goal of the present line of enquiry is to determine precisely which of these models 
have sharp thresholds, but this appears to be quite difficult; in Section 2 we show that it is 
at least as difficult as determining the location of the threshold for 3-colourability, something 
that has been sought after for more than 50 years (see eg. [14, 28]). A more fundamental goal 
is to obtain a better understanding of what can cause some problems to have coarse thresholds 
rather than sharp ones. 

Molloy [27] and independently Creignou and Daudé [10] introduced a wide family of models for
random constraint satisfaction problems which includes 3-SAT and many of its generalizations.
This permits us to study them under a common umbrella, rather than one at a time. Molloy
determined precisely which models from this family have any threshold at all ([10] provides the
same result for those models with domain size 2). But he left open the much more important
question of which models have sharp thresholds. In this paper, we begin to address this question.
We answer it for two of the most natural subfamilies: the so-called (d, k, t)-family (Theorem
2), and the family of graph and hypergraph homomorphism problems (Theorem 4). We also
shed light on the more fundamental problem by determining the only properties that can cause
a coarse threshold in binary constraint satisfaction problems with domain size 3.

The standard example of a problem with a coarse threshold is 2-colourability. Here, there 
is a coarse threshold precisely because unsatisfiability (i.e. non-2-colourability) can be caused 
only by the presence of odd cycles. Roughly speaking, Friedgut's theorem[17] implies that a 
problem exhibits a coarse threshold iff unsatisfiability is approximately equivalent to having one 
of a set of unicyclic subproblems. It is not hard to see that if there are unsatisfiable unicyclic
instances of a problem then that problem exhibits a coarse threshold (or exhibits no threshold 
at all). This makes it quite natural to pose the following rule-of-thumb: 

Hypothesis A: If a random model from the family in [27] is such that: (a) it exhibits a
threshold, and (b) every unicyclic instance is satisfiable, then that threshold is sharp.

However, reality is not that simple. [27] presents a counterexample to Hypothesis A; others
are presented in this paper. Nevertheless, the hypothesis holds for certain subfamilies of models.
Creignou and Daudé [10] conjectured that Hypothesis A holds for problems with domain size
two; this was proven by Istrate [23], and independently Creignou and Daudé [11] proved it for
the case where the model is symmetric. Theorems 2 and 4 in this paper show that Hypothesis
A holds for the (d, k, t)-models and for homomorphism problems.

In general, coarse thresholds can be caused by much more subtle and insidious reasons than
unsatisfiable unicyclic instances. We begin to understand some of these reasons by
focusing on the case where the constraint size is two and the domain size is three (a natural next
step after the well-understood domain-size-two case). We identify a particular
subtle cause, and show that this and unsatisfiable unicyclic instances are the only things that
can cause a coarse threshold (Theorem 14). If we permit either greater domain sizes or greater
constraint sizes then this is no longer true: there are other possible causes.

1.1 The random models 

In our setting, the variables of a constraint satisfaction problem (CSP) all have the same
domain of permissible values, $\{1, \ldots, d\}$, and all constraints will have size $k$, for some fixed
integers $d, k$. Given a $k$-tuple of variables, $(x_1, \ldots, x_k)$, a restriction on $(x_1, \ldots, x_k)$ is a $k$-tuple of
values $R = (\delta_1, \ldots, \delta_k)$ where each $1 \le \delta_i \le d$. For each $k$-tuple $(x_1, \ldots, x_k)$, the set of restrictions
on that $k$-tuple is called a constraint. The empty constraint is the constraint which contains no
restrictions. We say that an assignment of values to the variables of a constraint $C$ satisfies $C$
if that assignment is not one of the restrictions in $C$. An assignment of values to all variables
in a CSP satisfies that CSP if every constraint is simultaneously satisfied. A CSP is satisfiable
if it has such a satisfying assignment.
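
The definitions above translate directly into a brute-force check. The following is a minimal Python sketch for illustration only, not part of the paper: the representation (a constraint as a pair consisting of a tuple of variable indices and a set of forbidden value tuples) and the names `satisfies` and `is_satisfiable` are assumptions made here.

```python
from itertools import product

def satisfies(assignment, constraints):
    """True if no constraint has its variables assigned one of its restrictions.

    assignment: sequence of values, indexed by variable 0..n-1 (hypothetical encoding).
    constraints: list of (variables, restrictions), where restrictions is a set
    of forbidden value tuples; an empty set is the empty constraint.
    """
    return all(
        tuple(assignment[v] for v in variables) not in restrictions
        for variables, restrictions in constraints
    )

def is_satisfiable(n, d, constraints):
    """Exhaustively try all d**n assignments with values in {1, ..., d}."""
    return any(
        satisfies(a, constraints)
        for a in product(range(1, d + 1), repeat=n)
    )
```

With $d = 2$, $k = 2$ and the single constraint forbidding $(1,1)$ and $(2,2)$, this is 2-colourability, so a triangle is unsatisfiable and a path is satisfiable. The exhaustive search is exponential in $n$ and is meant only to make the definitions concrete.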

It will be convenient to consider a set of canonical variables $X_1, \ldots, X_k$ which are used only
to describe the "pattern" of a constraint. These canonical variables are not variables of the
actual CSP. For any $d, k$ there are $d^k$ possible restrictions and $2^{d^k}$ possible constraints over the
$k$ canonical variables. We denote this set of constraints as $\mathcal{C}^{d,k}$. For our random model, one
begins by specifying a particular probability distribution, $\mathcal{P}$, over $\mathcal{C}^{d,k}$. We use $\mathrm{supp}(\mathcal{P})$ to
denote the support of $\mathcal{P}$; i.e. the set of constraints $C$ with $\mathcal{P}(C) > 0$. Different choices of $\mathcal{P}$
give rise to different instances of the model.

We now define our random models. The "$G_{n,M}$" model, where the number of constraints is
fixed to be $M$, is the most common. But in this paper, it will be much more convenient to focus
on the "$G_{n,p}$" model where each $k$-tuple of variables is chosen independently with probability
$p = c/n^{k-1}$ to receive a constraint. The two models are, in most respects, equivalent when
$M = (c/k!)n$. In particular, it is straightforward to show that one exhibits a sharp threshold
iff the other does.

The $CSP_{n,p}(\mathcal{P})$ Model: Specify $n, p$ and $\mathcal{P}$ (typically $p = c/n^{k-1}$ for some constant $c$; note
that $\mathcal{P}$ implicitly specifies $d, k$). First choose a random $k$-uniform hypergraph on $n$ variables
where each of the $\binom{n}{k}$ potential hyperedges is selected with probability $p$. Next, for each
hyperedge $e$, we choose a constraint on the $k$ variables of $e$ as follows: we take a random
permutation from the $k$ variables onto $\{X_1, \ldots, X_k\}$ and then we select a random constraint
according to $\mathcal{P}$ and map it onto the $k$ variables.
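
The two-step sampling procedure just described can be sketched in a few lines of Python. This is an illustrative sketch under assumptions of our own: the distribution is passed as a list of (constraint, probability) pairs, a constraint is a frozenset of forbidden value tuples over the canonical variables, and the name `sample_csp` is hypothetical.

```python
import random
from itertools import combinations

def sample_csp(n, p, dist, k, rng=None):
    """Sample an instance: each k-subset of the n variables gets a constraint with probability p.

    dist: list of (constraint, probability) pairs describing the distribution
    (hypothetical encoding). Returns a list of (ordered_variables, constraint).
    """
    rng = rng or random.Random()
    constraints, weights = zip(*dist)
    instance = []
    for subset in combinations(range(n), k):
        if rng.random() < p:
            order = list(subset)
            rng.shuffle(order)  # random bijection onto the canonical variables X_1..X_k
            chosen = rng.choices(constraints, weights=weights, k=1)[0]
            instance.append((tuple(order), chosen))
    return instance
```

Position $i$ of `ordered_variables` plays the role of the canonical variable $X_{i+1}$, so a restriction $(\delta_1, \ldots, \delta_k)$ of the chosen constraint forbids assigning $\delta_{i+1}$ to the variable in position $i$ for all $i$ simultaneously.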

A property holds almost surely (a.s.) if the limit as $n \to \infty$ of the probability of it holding is 1. We say that
$CSP_{n,p}(\mathcal{P})$ has a sharp threshold of satisfiability if there is some $c = c(n) > 0$ such that for
every $\epsilon > 0$, if $p = (1 - \epsilon)c/n^{k-1}$ then $CSP_{n,p}(\mathcal{P})$ is a.s. satisfiable and if $p = (1 + \epsilon)c/n^{k-1}$
then $CSP_{n,p}(\mathcal{P})$ is a.s. unsatisfiable. This is often abbreviated to just sharp threshold. We say
that $CSP_{n,p}(\mathcal{P})$ has a coarse threshold if for all $c$ in some interval $c_1(n) < c < c_2(n)$, $CSP_{n,p}(\mathcal{P})$
is neither a.s. satisfiable nor a.s. unsatisfiable. If $CSP_{n,p}(\mathcal{P})$ has neither a sharp nor a coarse
threshold, then it is easy to see that it must either be a.s. satisfiable for all $c > 0$ or a.s.
unsatisfiable for all $c > 0$.

Each $k$-tuple of vertices can have at most one constraint in $CSP_{n,p}(\mathcal{P})$. When applying
Friedgut's theorem, it will be convenient to relax this condition, and allow $k$-tuples to possibly
receive multiple constraints. Thus up to $k! \times |\mathrm{supp}(\mathcal{P})|$ constraints can appear on a $k$-tuple of
variables.

The $\widetilde{CSP}_{n,p}(\mathcal{P})$ Model: Specify $n, p$ and $\mathcal{P}$. For each of the $n(n-1)\cdots(n-k+1)$ ordered
$k$-tuples of variables and each constraint $C \in \mathrm{supp}(\mathcal{P})$, we assign $C$ to the ordered $k$-tuple
with probability $\mathcal{P}(C) \times p/k!$.

Note that the expected total number of constraints is the same under each model. Furthermore,
it is easy to calculate that the probability of at least one $k$-tuple receiving more than
one constraint in the relaxed model $\widetilde{CSP}_{n,p}(\mathcal{P})$ is $o(1)$ for $k \ge 3$, and an absolute constant $0 < a < 1$ for $k = 2$.
It follows that if a property holds a.s. in $\widetilde{CSP}_{n,p}(\mathcal{P})$ then it holds a.s. in $CSP_{n,p}(\mathcal{P})$. As a
corollary, we have:

Lemma 1 If $\widetilde{CSP}_{n,p}(\mathcal{P})$ has a sharp threshold then so does $CSP_{n,p}(\mathcal{P})$. The reverse is true
for $k \ge 3$.

So for the remainder of the paper, whenever we wish to prove that $CSP_{n,p}(\mathcal{P})$ has a sharp
threshold, we will work in the relaxed $\widetilde{CSP}_{n,p}(\mathcal{P})$ model.
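
The remark that both models have the same expected number of constraints can be checked with exact arithmetic. The sketch below is ours (function names are hypothetical): the first model has $\binom{n}{k}$ potential hyperedges, each present with probability $p$; the relaxed model has $n(n-1)\cdots(n-k+1)$ ordered tuples, each receiving constraint $C$ with probability $\mathcal{P}(C) \times p/k!$.

```python
from fractions import Fraction
from math import comb, factorial, perm

def expected_simple(n, k, p):
    """Expected number of constraints when each k-subset is chosen with probability p."""
    return comb(n, k) * p

def expected_relaxed(n, k, p, probs):
    """Expected number of constraints in the relaxed model.

    probs: the probabilities P(C) over the support (they should sum to 1).
    """
    ordered_tuples = perm(n, k)  # n(n-1)...(n-k+1)
    return sum(ordered_tuples * q * p / factorial(k) for q in probs)
```

The two values agree because $n(n-1)\cdots(n-k+1)/k! = \binom{n}{k}$ and the probabilities $\mathcal{P}(C)$ sum to 1.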

We often focus on the constraint hypergraph of a CSP; i.e. the hypergraph whose vertices 
are the variables and whose edges are the tuples of variables that have constraints. A tree- 
CSP is a CSP whose constraint hypergraph is a hypertree. A CSP is unicyclic if its constraint 
hypergraph is unicyclic; i.e. has exactly one cycle. (Hypertree and cycle are defined below.)

We close this subsection with some hypergraph definitions. A hypergraph consists of a set 
of vertices and a set of hyperedges, where each hyperedge is a collection of vertices. If every 
hyperedge has size exactly k then the hypergraph is k-uniform. In a simple hypergraph, no 
vertex appears twice in any one hyperedge, and no two edges are identical. So, for example, 
the constraint hypergraph of $CSP_{n,p}(\mathcal{P})$ is simple, but the constraint hypergraph of the relaxed model $\widetilde{CSP}_{n,p}(\mathcal{P})$
may have multiple edges. Neither model permits multiple copies of a vertex in a single edge,
but such edges are possible when we discuss hypergraph homomorphism problems. The edge
$(v, \ldots, v)$ is called a loop.

A walk $P$ of size $r$ is a sequence of $r$ hyperedges and $r + 1$ vertices $(v_0, e_1, v_1, \ldots, e_r, v_r)$
such that $e_i$ contains both $v_{i-1}$ and $v_i$. A walk is a path if the $v_i$ are distinct. A walk is a cycle
of size $r$ if for $i = 1, \ldots, r$ the $v_i$ and $e_i$ are distinct, and $v_0 = v_r$. The distance from a vertex
$u$ to a vertex $v$ is the minimum $r$ such that there exists a walk of length $r$, $(v, e_1, v_1, \ldots, e_r, u)$;
the distance of a vertex from itself is defined to be 0. The distance from a vertex $v$ to a set of
vertices is the minimum distance from $v$ to any vertex in the set. A hypergraph is a hypertree
if it has no cycles and it is connected.

By contracting two vertices $u$ and $v$ into a new vertex $w$, we mean (i) adding a new vertex $w$
to the set of the vertices, (ii) replacing $u$ and $v$ in every hyperedge by $w$, and (iii) removing $u$
and $v$.
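
The contraction operation can be written out as follows. This is a small illustrative sketch; the representation of a hypergraph as a vertex set plus a list of tuples, and the name `contract`, are assumptions made here.

```python
def contract(vertices, hyperedges, u, v, w):
    """Contract u and v into the new vertex w.

    (i) add w, (ii) replace u and v by w in every hyperedge, (iii) remove u and v.
    Hyperedges are tuples, so repeated copies of w inside one edge are kept.
    """
    new_vertices = (set(vertices) - {u, v}) | {w}
    new_edges = [
        tuple(w if x in (u, v) else x for x in edge)
        for edge in hyperedges
    ]
    return new_vertices, new_edges
```

Note that a hyperedge containing both $u$ and $v$ ends up with two copies of $w$, which is one way the non-simple hyperedges mentioned above can arise.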

1.2 Two special families 

Perhaps the most natural choice for $\mathcal{P}$ is the distribution obtained by selecting each of the $d^k$
possible restrictions independently with probability $l/d^k$. However, as noted in [4], every such
choice of $\mathcal{P}$ yields a model that is a.s. unsatisfiable for any non-trivial choice of $p$. So this is a
rather uninteresting family of models, particularly as far as the study of thresholds goes.

The next most natural choice for $\mathcal{P}$ is to fix $t$, the number of restrictions per clause, and
to make every constraint with exactly $t$ restrictions equally likely. (Note that for $d = 2$, $t = 1$
this yields random $k$-SAT.) This is often called the $(d, k, t)$-model and has received a great deal
of study, both from a theoretical perspective [26, 30] and from experimentalists (see [18] for a
survey of many such studies). In [4] it is shown that when $t \ge d^{k-1}$, this model is problematic
in the same way as the previously mentioned one, as it is a.s. unsatisfiable even for values of
$p = o(1/n^{k-1})$ (i.e. when the number of constraints is $o(n)$). However, it was proven in [18]
that for every $1 \le t < d^{k-1}$, the $(d, k, t)$-model does not have that problem. One of the main
contributions of this paper is to show that in this case the model exhibits a sharp threshold:

Theorem 2 For every $d, k \ge 2$ and every $1 \le t < d^{k-1}$, the $(d, k, t)$-model has a sharp
threshold.

From a different perspective, it is quite natural to consider the case where every constraint
is identical, i.e. $|\mathrm{supp}(\mathcal{P})| = 1$. It is not hard to see that every such problem is equivalent to
a hypergraph homomorphism problem, as defined below:

For two $k$-uniform hypergraphs, $G, H$, a homomorphism from $G$ to $H$ is a mapping $h$ from
$V(G)$ to $V(H)$ such that for each edge $(v_1, v_2, \ldots, v_k)$ of $G$, $(h(v_1), h(v_2), \ldots, h(v_k))$ is an edge
of $H$. We say that $G$ is homomorphic to $H$ if there exists such a homomorphism. When $k = 2$
and $H$ is the complete graph on $d$ vertices with no loops, we are simply asking whether $G$ has a $d$-colouring.
Homomorphisms are an important generalization of graph colouring (see e.g. [21]). They are
often also referred to as $H$-colourings (e.g. [22, 20]).

Suppose that $H$ is a fixed hypergraph, and $G$ is a random hypergraph on $n$ vertices where
each of the $\binom{n}{k}$ potential hyperedges is selected with probability $p$. Set $d$ to be equal to the
number of vertices in $H$ and define a constraint $C$ with domain size $d$ and constraint size $k$ by
saying that $C$ permits $X_1 = \delta_1, \ldots, X_k = \delta_k$ iff $(\delta_1, \ldots, \delta_k)$ is a hyperedge of $H$. Treat each vertex
of $G$ as a variable with domain $\{1, \ldots, d\}$ and assign $C$ to each hyperedge of $G$. We call this the
$H$-homomorphism problem.

Thus we have an instance of $CSP_{n,p}(\mathcal{P})$ where $C$ is the only constraint in $\mathrm{supp}(\mathcal{P})$ and
furthermore $C$ is symmetric under permutations of the canonical variables; in other words, all
constraints are identical even under permutations of variables. It is easy to see that every such
$\mathcal{P}$ corresponds to a homomorphism problem; just take $H$ to be the hypergraph where $(\delta_1, \ldots, \delta_k)$
is a hyperedge iff $C$ permits $X_1 = \delta_1, \ldots, X_k = \delta_k$. Note that here a hyperedge in $H$ may contain
multiple copies of a vertex.
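
The correspondence between a fixed $H$ and a single constraint can be made concrete with a short sketch. Everything below is illustrative and uses assumptions of our own: vertices of $H$ are labelled $1, \ldots, d$, hyperedges are tuples, and the helper names (`undirected`, `restrictions_from_H`, `has_homomorphism`) are hypothetical. The search is exhaustive, so it is only usable on very small instances.

```python
from itertools import permutations, product

def undirected(edges):
    """Close a collection of hyperedges under all orderings (undirected H)."""
    return {p for e in edges for p in permutations(e)}

def restrictions_from_H(d, k, H_edges):
    """The constraint C as a set of restrictions: all k-tuples that are NOT hyperedges of H."""
    return {t for t in product(range(1, d + 1), repeat=k) if t not in H_edges}

def has_homomorphism(G_vertices, G_edges, d, k, H_edges):
    """Brute-force test: is there a map h with h(e) a hyperedge of H for every edge e of G?"""
    forbidden = restrictions_from_H(d, k, H_edges)
    for values in product(range(1, d + 1), repeat=len(G_vertices)):
        h = dict(zip(G_vertices, values))
        if all(tuple(h[v] for v in e) not in forbidden for e in G_edges):
            return True
    return False
```

For example, with $H$ the triangle the restrictions are exactly $(1,1), (2,2), (3,3)$, so the $H$-homomorphism problem is 3-colourability: the 5-cycle maps to the triangle, while $K_4$ does not.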

Thus, these $H$-homomorphism problems are not only important as a fundamental graph
problem, but also because they form a very natural subclass of our family of random CSP
models. In this paper, we prove that Hypothesis A holds for every connected $H$.

It is easy to see that if $H$ has a loop $(\delta, \ldots, \delta)$ then every hypergraph is trivially homomorphic
to $H$ (just map every vertex to $\delta$); so the $H$-homomorphism problem has no threshold at all.
The other trivial case is where $H$ has no hyperedges at all and so no non-trivial hypergraph
has an $H$-homomorphism.

Lemma 3 Suppose that $H$ is a nontrivial hypergraph with no loops. We have the following:

1. For $k \ge 3$, every unicyclic hypergraph is homomorphic to a single hyperedge, and hence to
$H$.

2. For $k = 2$: if the triangle is homomorphic to $H$, then so is every unicyclic graph; and the
triangle is homomorphic to $H$ iff $H$ contains a triangle.

Proof. To prove part (1), let $(v_0, e_1, v_1, e_2, v_2, \ldots, e_r, v_0)$ be the unique cycle of the hypergraph,
and let $(w_0, \ldots, w_{k-1})$ be a single hyperedge. Define $h(v_i) = w_{(i \bmod 2)}$ for every $0 \le i \le r - 2$
and $h(v_{r-1}) = w_2$. It is easy to see that one can extend $h$ to a homomorphism from the unicyclic
hypergraph to $H$.

Part (2) easily follows from the well-known fact that every cycle is homomorphic
to the triangle, and the triangle is not homomorphic to any cycle of size greater than 3. ■
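
The map used in part (1) is easy to tabulate. The sketch below (the name `cycle_images` is ours) lists the index $j$ with $h(v_i) = w_j$ along a cycle of size $r$. One can then check that cyclically consecutive cycle vertices always receive different images; our reading is that this is the property needed to extend $h$ over each hyperedge, and it uses three distinct targets $w_0, w_1, w_2$, which is where $k \ge 3$ enters.

```python
def cycle_images(r):
    """Indices j with h(v_i) = w_j for i = 0, ..., r-1 on a cycle of size r >= 2.

    h(v_i) = w_(i mod 2) for 0 <= i <= r-2, and h(v_(r-1)) = w_2.
    """
    return [i % 2 for i in range(r - 1)] + [2]
```

For a cycle of size 5 the images are $w_0, w_1, w_0, w_1, w_2$; the final $w_2$ is what lets odd cycles close up without two adjacent cycle vertices sharing an image.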

From Lemma 3 we conclude that proving that Hypothesis A holds whenever H is connected 
and undirected is equivalent to proving: 

Theorem 4 If $H$ is a connected undirected loopless hypergraph with at least one edge, then the
$H$-homomorphism problem has a sharp threshold iff (a) $k \ge 3$ or (b) $k = 2$ and $H$ contains a
triangle.

We do not have a strong feeling as to whether the "connected" condition is necessary here; 
we discuss the possibility of extending Theorem 4 to disconnected graphs in Section 3. 

1.3 Tools 

Our main tool is distilled from Friedgut's main theorem in [17]. Friedgut reported to us [19] that
his proof can be adapted to the setting of this paper. To provide Friedgut's theorem for CSP's in
its full power instead of being restricted to the unsatisfiability property, we consider, as Friedgut
did, every monotone property, where a property $A$ is called monotone if it is preserved under
constraint addition. A property $A$ on CSP's is called monotone symmetric if it is monotone
and invariant under CSP automorphisms. For a property $A$, $A_n$ denotes the restriction of
$A$ to CSP's with exactly $n$ variables. Roughly speaking, Friedgut's theorem says that for a
value of $p$ that is "within" the coarse threshold, there is a constant sized instance $M$ such that
$\tau < \Pr[M \subseteq CSP_{n,p}(\mathcal{P})] < 1 - \tau$ for some constant $\tau$ which does not depend on $n$, and adding
$M$ to our random CSP boosts the probability of being in $A$ by at least $2\alpha > 0$, whereas adding a
linear number of new random constraints only boosts it by at most $\alpha$. First, we must formalize
what we mean by "adding $M$". Given two CSP's $M, F$ where $M$ has $r$ variables, and $F$ has
at least $r$ variables, we define $F \oplus M$ to be the CSP obtained by choosing a random $r$-tuple of
variables in $F$ and then adding $M$ on those $r$ variables. Now we can state Friedgut's theorem
formally:

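Theorem 5 Let $A = \{A_i\}$ be a series of monotone symmetric properties in $\widetilde{CSP}_{n,p}(\mathcal{P})$ with
a coarse threshold. There exist $p = p(n)$, $\tau, \alpha, \epsilon > 0$, and a CSP $M$ whose constraints are chosen
from $\mathrm{supp}(\mathcal{P})$ such that for an infinite number of $n$:

(a) $\alpha < \Pr[\widetilde{CSP}_{n,p}(\mathcal{P}) \in A] < 1 - 3\alpha$.

(b) $\tau < \Pr[M \subseteq \widetilde{CSP}_{n,p}(\mathcal{P})] < 1 - \tau$.

(c) $\Pr[\widetilde{CSP}_{n,p}(\mathcal{P}) \oplus M \in A] > 1 - \alpha$.

(d) $\Pr[\widetilde{CSP}_{n,p(1+\epsilon)}(\mathcal{P}) \in A] < 1 - 2\alpha$.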

When, as in our setting, $p(n) = c(n)/n^{k-1}$, Theorem 5(b) implies that $M$ is a unicyclic CSP.
So we obtain the following corollary, which is our main tool in this paper.

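Corollary 6 For any $\mathcal{P}$, if $\widetilde{CSP}_{n,p}(\mathcal{P})$ has a coarse threshold of satisfiability then there exist
$p = p(n)$, $\alpha, \epsilon > 0$, and a unicyclic CSP $M$ on a constant number of variables whose constraints
are chosen from $\mathrm{supp}(\mathcal{P})$ such that:

(a) $\alpha < \Pr[\widetilde{CSP}_{n,p}(\mathcal{P}) \text{ is unsatisfiable}] < 1 - 3\alpha$.

(b) $\Pr[\widetilde{CSP}_{n,p(1+\epsilon)}(\mathcal{P}) \text{ is unsatisfiable}] < 1 - 2\alpha$.

(c) $\Pr[\widetilde{CSP}_{n,p}(\mathcal{P}) \oplus M \text{ is unsatisfiable}] > 1 - \alpha$.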

Our next tool proves some properties for local parts of a random CSP. 

Lemma 7 Suppose that $p < cn^{1-k}$ for some positive constant $c$, and let $G$ be an instance of
$CSP_{n,p}(\mathcal{P})$. Choose a set $T$ of $t$ random variables. Then for every $\epsilon > 0$ and integer $r > 0$
there exists an integer $L(c, t, r, \epsilon)$ such that with probability at least $1 - \epsilon$:

(i): No constraint of $G$ contains more than one variable of $T$.
(ii): $G$ induces a forest on the set of the variables that are of distance at most $r$ from $T$.
(iii): There are at most $L$ variables that are of distance at most $r$ from $T$.



Proof. Let $E_1$, $E_2$, and $E_3$ denote the events (i), (ii), and (iii) respectively. Trivially

\[ \Pr[E_1] \ge 1 - \sum_{i=2}^{k} n^{k-i} \binom{t}{i} k!\, p = 1 - o(1). \tag{1} \]

The expected number of cycles of size at most $2r$ which contain at least one variable in $T$
is at most $t \sum_{i=2}^{2r} n^{i(k-1)-1} p^i$. Thus

\[ \Pr[E_2] \ge 1 - t \sum_{i=2}^{2r} n^{i(k-1)-1} p^i \ge 1 - t \sum_{i=2}^{2r} \frac{c^i}{n} = 1 - o(1). \tag{2} \]

The expected number of variables at distance at most $r$ from $T$ is at most $t \sum_{i=1}^{r} n^{i(k-1)} p^i$.
So by Chebychev's inequality, for sufficiently large $L$:

\[ \Pr[E_3] \ge 1 - \frac{t \sum_{i=1}^{r} n^{i(k-1)} p^i}{L} > 1 - \frac{\epsilon}{2}. \tag{3} \]

The lemma follows from (1), (2), and (3). ■

Our third tool is easily proven with a straightforward first moment calculation and concentration
argument (via e.g. the second moment method or Talagrand's inequality).

Lemma 8 Let $T$ be a tree-CSP whose constraints are in $\mathrm{supp}(\mathcal{P})$. There exists $z = o(n^{1-k})$
such that a.s. $CSP_{n,z}(\mathcal{P})$ contains $T$ as a sub-CSP.



2 Difficulty 

The ultimate goal of this research is to characterize all distributions $\mathcal{P}$ for which $CSP_{n,p}(\mathcal{P})$
exhibits a sharp threshold. However, the following example indicates that this is very difficult, 
even for binary CSP's (the case where k = 2). In particular, it is at least as difficult as 
determining the location of the 3-colourability threshold, a heavily pursued open problem. 
(The existence of that threshold was proven in [3]; see [28] for a recent survey and see [6, 16] 
for the best current bounds on its location.) 

We set $d = 5$ and $k = 2$, and define two constraints by listing their pairs of forbidden values:

$C_1 = \{(1,1), (2,2)\} \cup (\{1,2\} \times \{3,4,5\}) \cup (\{3,4,5\} \times \{1,2\})$,

$C_2 = \{(3,3), (4,4), (5,5)\} \cup (\{1,2\} \times \{3,4,5\}) \cup (\{3,4,5\} \times \{1,2\})$.

Note that each constraint forces the endpoints of every edge to take values that are either both
in $\{1, 2\}$ or both in $\{3, 4, 5\}$. A $C_1$ constraint says that they have to be different values if they
are both in $\{1, 2\}$. A $C_2$ constraint says that they have to be different values if they are both
in $\{3, 4, 5\}$.
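
The stated properties of the two constraints can be verified mechanically. The following Python sketch is illustrative only; the names `LOW`, `HIGH`, `CROSS` and `permitted` are ours.

```python
from itertools import product

LOW, HIGH = {1, 2}, {3, 4, 5}

# Forbidden pairs mixing the two blocks, shared by both constraints.
CROSS = set(product(LOW, HIGH)) | set(product(HIGH, LOW))

C1 = {(1, 1), (2, 2)} | CROSS
C2 = {(3, 3), (4, 4), (5, 5)} | CROSS

def permitted(constraint):
    """All value pairs over {1, ..., 5} that are not restrictions of the constraint."""
    return set(product(range(1, 6), repeat=2)) - constraint
```

Under $C_1$ the only permitted pairs inside $\{1, 2\}$ are $(1,2)$ and $(2,1)$, while every pair inside $\{3, 4, 5\}$ is permitted; $C_2$ behaves the other way round, acting as a proper 3-colouring constraint on $\{3, 4, 5\}$ and imposing nothing inside $\{1, 2\}$.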

We let $C_1$ occur with probability $q$ and $C_2$ occur with probability $1 - q$ in $\mathcal{P}$. Set $c(q) =
(1 - q)/q$.

Fact 9 (a) If $G_{n,p=c(q)/n}$ is a.s. 3-colourable, then $CSP_{n,p}(\mathcal{P})$ has a sharp threshold.

(b) If there is some $\epsilon > 0$ such that $G_{n,p=(c(q)-\epsilon)/n}$ is a.s. not 3-colourable, then $CSP_{n,p}(\mathcal{P})$
has a coarse threshold.

Thus, determining the type of the threshold for all such models $CSP_{n,p}(\mathcal{P})$ requires
knowing for which values of $c$ the random graph $G_{n,p=c/n}$ is a.s. 3-colourable, and for which values it is a.s.
not 3-colourable.

Proof Choose our $CSP_{n,p}(\mathcal{P})$ by first taking $G_{n,p=c/n}$ and then setting each edge to be
$C_1$ with probability $q$ and $C_2$ otherwise. Let $G_1, G_2$ be the subgraphs formed by the edges
chosen to be $C_1, C_2$ respectively. If $c < 1$ then all components of $G_{n,p=c/n}$ are trees or unicycles
and the CSP is trivially satisfiable. So we can focus on the range $c > 1$ and we let $T$ denote
the giant component of $G_{n,p=c/n}$. Note that the variables of $T$ must either all take values from
$\{1, 2\}$ or all take values from $\{3, 4, 5\}$.

Case 1: $c > \frac{1}{q}$. Then $G_1$ is equivalent to $G_{n,p=c_1/n}$ for some $c_1 > 1$ and it follows easily
that a.s. $G_1$ contains a giant component which is not 2-colourable. This giant component is
a subgraph of $T$ and so the variables of $T$ must all take values from $\{3, 4, 5\}$. It follows that
the CSP is satisfiable iff $G_2$ is 3-colourable. Note that $G_2$ is equivalent to $G_{n,p=c_2/n}$ for some
$c_2 > c(q)$.

Case 2: $c < \frac{1}{q}$. Then $G_1$ is equivalent to $G_{n,p=c_1/n}$ for some $c_1 < 1$ and $G_2$ is equivalent to
$G_{n,p=c_2/n}$ for some $c_2 < c(q)$. If $G_2$ is a.s. 3-colourable then the CSP is a.s. satisfiable. If $G_2$ is
a.s. not 3-colourable then the CSP is satisfiable iff the $C_1$ edges within $T$ are 2-colourable; i.e., if $G_1$ does not have
an odd cycle lying within $T$. It is easy to see that this occurs with probability between $\zeta$ and
$1 - \zeta$ for some $\zeta > 0$; i.e. that the CSP is neither a.s. satisfiable nor a.s. unsatisfiable.

Fact 9 now follows. If $G_{n,p=c(q)/n}$ is a.s. 3-colourable, then $CSP_{n,p}(\mathcal{P})$ has a sharp threshold
which lies somewhere above $\frac{1}{q}$. If there is some $\epsilon > 0$ such that $G_{n,p=(c(q)-\epsilon)/n}$ is a.s. not
3-colourable, then $CSP_{n,p}(\mathcal{P})$ has a coarse threshold running from $\frac{1}{q} - \delta$ to $\frac{1}{q}$ for some $\delta > 0$. □
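
The role of the value $\frac{1}{q}$ in the two cases can be seen from a small calculation. The sketch below rests on an inference of ours that the proof leaves implicit: labelling each edge of $G_{n,p=c/n}$ independently as $C_1$ with probability $q$ gives $c_1 = qc$ and $c_2 = (1-q)c$. The function names are hypothetical.

```python
from fractions import Fraction

def split_densities(c, q):
    """Densities (c_1, c_2) of G_1 and G_2, assuming c_1 = q*c and c_2 = (1-q)*c."""
    return q * c, (1 - q) * c

def c_of_q(q):
    """The quantity c(q) = (1 - q)/q defined in the text."""
    return (1 - q) / q
```

At $c = \frac{1}{q}$ this gives exactly $c_1 = 1$ and $c_2 = c(q)$, so $c > \frac{1}{q}$ corresponds to $c_1 > 1$ and $c_2 > c(q)$, and $c < \frac{1}{q}$ to the reverse inequalities, matching Cases 1 and 2.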

3 Homomorphisms 

In this section, we prove our theorem concerning $H$-homomorphisms. Let $G^k_{n,p}$ denote the
random $k$-uniform hypergraph on $n$ vertices where each $k$-tuple is present as a hyperedge with
probability $p$.

Proof of Theorem 4 We begin with the case $k \ge 3$. Let $H$ be some $k$-uniform hypergraph,
and assume that the $H$-homomorphism problem has a coarse threshold. Let $M, p, \alpha, \epsilon$ be as guaranteed by Corollary 6. In
this setting, $M$ is a unicyclic $k$-uniform hypergraph, such that adding $M$ to $G^k_{n,p}$ boosts the
probability of not having a homomorphism to $H$ by at least $2\alpha$.

Consider $G = G^k_{n,p} \oplus M$. Let $M^+$ be the subgraph of $G$ consisting of all hyperedges that
contain at least one vertex of $M$ (and, of course, all vertices in those hyperedges); in other words,
$M^+$ is the subhypergraph induced by the vertices of $M$ and all their neighbours. Lemma 7
implies that there is some constant $L$ such that with probability at least $1 - \frac{\alpha}{2}$: $M^+$ is unicyclic
and has at most $L$ vertices, and no hyperedge of $G^k_{n,p}$ contains more than one vertex of $M$.

Since $M$ is unicyclic, by Lemma 3 there exists a homomorphism $h$ from $M$ to a single edge,
say $(v_1, \ldots, v_k)$. Let $h_i$ be the set of the vertices in $M$ that are mapped by $h$ to $v_i$. Obtain the
hypergraph $G'$ from $G$ by (i) removing all edges in $M$; (ii) contracting all of the vertices in $h_i$
into one single new vertex $u_i$, for each $1 \le i \le k$; (iii) adding the single hyperedge $(u_1, \ldots, u_k)$.

Suppose that $h'$ is a homomorphism from $G'$ to $H$. Then a mapping from the vertices of $G$
to the vertices of $H$ which maps every vertex $v$ in $G - M$ to $h'(v)$, and every vertex in $h_i$ to
$h'(u_i)$, is a homomorphism from $G$ to $H$. Thus, if $G'$ is homomorphic to $H$ then so is $G$.

Let $T$ be the hypertree defined as follows: $T$ has a hyperedge $(t_1, \ldots, t_k)$, and each $t_i$ lies in
$L$ other hyperedges. Only $t_1, \ldots, t_k$ lie in more than one edge of $T$. Thus, $T$ has $k + k(k-1)L$
vertices and $kL + 1$ hyperedges. Note that in $G'$ the subgraph induced by all edges containing
$\{u_1, \ldots, u_k\}$ forms a subtree of $T$. It follows that $G^k_{n,p} \oplus T$ is at least as likely to be
non-homomorphic to $H$ as $G'$ is, so:

\[ \Pr[G^k_{n,p} \oplus M \text{ is not homomorphic to } H] \le \Pr[G^k_{n,p} \oplus T \text{ is not homomorphic to } H] + \frac{\alpha}{2}. \]

By Lemma 8, increasing $p$ by an additional $\epsilon p$ a.s. results in the addition of a copy of $T$. Thus:

\[ \Pr[G^k_{n,p} \oplus T \text{ is not homomorphic to } H] \le \Pr[G^k_{n,p(1+\epsilon)} \text{ is not homomorphic to } H] \]

which yields a contradiction to Corollary 6(b).

This proves the case where $k \ge 3$, so we now turn to the case $k = 2$. If $H$ contains no triangle,
then $K_3$ is not homomorphic to $H$. Thus, $K_3$ forms a unicyclic unsatisfiable CSP using the
$H$-colouring constraints and so we do not have a sharp threshold. So we will focus on graphs
$H$ that contain a triangle. Our proof follows along the same lines as the case $k \ge 3$, but is
complicated a bit since we can no longer assume that $M$ is homomorphic to a single edge. We
only highlight the differences.

Define $M^+$ to be the subgraph of $G = G^2_{n,p} \oplus M$ induced by all vertices within distance
r = + $|V(M)| + 3$ of the unique cycle of $M$. By Lemma 7 there is some constant $L$ such
that with probability at least $1 - \frac{\alpha}{2}$: $M^+$ is unicyclic and has at most $L$ vertices.

Define $U$ to be the set of vertices of $G$ that are of distance exactly r = + $|V(M)| + 3$
from the unique cycle of $M$. Consider any vertex $u \in H$. By Lemma 10 below, if $M^+$ is
unicyclic then there is a homomorphism from $M^+$ to $H$ such that all vertices in $U$ are mapped
to $u$.

Obtain the graph $G'$ from $G$ by (i) removing all of the vertices of distance less than $r$ from
the unique cycle of $M$, and (ii) contracting $U$ into a single new vertex $u$. Suppose that $h'$ is
a homomorphism from $G'$ to $H$. Then by the previous paragraph, $h'$ can be extended to a
homomorphism from $G$ to $H$ where each vertex $v \in V(G') - u$ is mapped to $h'(v)$, and every
vertex in $U$ is mapped to $h'(u)$. Thus, if $G'$ is homomorphic to $H$ then so is $G$.

Let $T$ be the tree which consists of a vertex adjacent to $L$ leaves. Since the degree of $u$ in
$G'$ is at most $L$, and using the fact that all vertices of $M$ are deleted when forming $G'$ (here is
where we require $r > |M|$), the rest now follows as in the $k \ge 3$ case. □

Lemma 10 Let $H$ be a connected graph which contains a triangle. Let $u$ be a vertex of $H$ and
$M$ be a unicyclic graph with unique cycle $C$. Denote the vertices of $M$ at distance exactly
r > + 3 from $C$ by $U$. There is a homomorphism from $M$ to $H$ such that all vertices in
$U$ are mapped to $u$.

Proof. Let $h$ be a homomorphism from $C$ to the triangle $(v_1, v_2, v_3)$ of $H$. Observe that for
$i = 1, 2, 3$, there exist walks $(v_i =)\, v_{i,0}, \ldots, v_{i,r}\, (= u)$ of length exactly $r$ in $H$. Let $w$ be a vertex
in $M$ at distance $j \le r$ from $C$, and $w'$ be the vertex of $C$ which has distance $j$
from $w$. Extend $h$ by assigning $h(w) = v_{i,j}$ where $h(w') = v_i$. Observe that $h$ is a partial
homomorphism from $M$ to $H$ which maps every vertex in $U$ to $u$. Trivially $h$ can be extended
to a homomorphism from $M$ to $H$. ■
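
The key step of this proof is the existence of walks of length exactly $r$ from each triangle vertex to $u$. The sketch below (function name and adjacency-dictionary representation are ours) computes the set of endpoints of walks of an exact length, which makes it easy to experiment with how large $r$ must be for a given small $H$. It illustrates the phenomenon on examples and does not reproduce the paper's bound on $r$.

```python
def reachable_in_exactly(adj, start, r):
    """Set of vertices that are endpoints of some walk of length exactly r from start.

    adj: dict mapping each vertex to the set of its neighbours (undirected graph).
    """
    frontier = {start}
    for _ in range(r):
        # One more step of the walk: move from every current endpoint to all neighbours.
        frontier = {y for x in frontier for y in adj[x]}
    return frontier
```

On a triangle with a pendant path attached, every vertex is reachable from each triangle vertex by a walk of length exactly 4, because the triangle allows the parity of a walk to be adjusted. On a path, which is bipartite, walks of even length from an endpoint never end at its neighbour, which is why the triangle is needed.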

We close this section by discussing the possibilities of extending Theorem 4 to the case where
$H$ is disconnected. We will focus on graphs, i.e. the $k = 2$ case. We can show that if Hypothesis
A does not hold for the $H$-homomorphism problem for every graph $H$, then there must be a
counterexample with two components: a triangle and a graph $H_1$ that is triangle-free and not
3-colourable. First note that every cycle is homomorphic to a triangle, and a triangle is not
homomorphic to any triangle-free graph. So the condition of Hypothesis A is equivalent to $H$
containing a triangle. On the other hand $H$ contains a triangle-free component because being
homomorphic to $H$ is equivalent to being homomorphic to at least one of the components $H_i$ of
$H$, and so there is some component $H_i$ such that the $H_i$-homomorphism problem has a coarse
threshold. Let $H^1$ be the subgraph of $H$ which consists of all triangle-free components and
$H^2$ be the remaining components of $H$. It is easy to see that $H$ remains a counterexample
if we add some edges to $H^1$ without creating any triangle and we substitute $H^2$ with a single
triangle.

So the question of whether there is any graph $H$ for which the $H$-homomorphism problem
violates Hypothesis A is equivalent to the following:

Question 11 Is there any triangle-free graph $H$ with $\chi(H) > 3$ such that for some values of $n$
and some $c > c(n)$, $G_{n,p=c/n}$ is not a.s. non-$H$-homomorphic, where $c(n)$ is the threshold value
of 3-colourability?

3.1 Directed Graphs 

Here, we provide an example of a directed graph $H$ for which

1. every unicyclic digraph has a homomorphism to $H$, and

2. the $H$-homomorphism problem under the relaxed $\widetilde{CSP}_{n,p}(\mathcal{P})$ model has a coarse threshold.

Unfortunately, this does not exhibit a coarse threshold under the $CSP_{n,p}(\mathcal{P})$ model, so the
question of whether Hypothesis A holds for all $H$-homomorphism problems for directed hypergraphs
$H$ is still open.

For a directed graph $D$, let $\bar{D}$ denote the undirected graph that is obtained from $D$ by
ignoring the directions on the edges. We define $D_{n,p}$ to be the random digraph on $n$ vertices
where each of the $n(n-1)$ potential directed edges is present with probability $p$. Thus $D_{n,p}$
may contain both $uv$ and $vu$ for some pair of vertices $v, u$, i.e. a 2-cycle; in fact, if
$p = c/n$ for a constant $c$, then it is straightforward to show that the probability that $D$ contains
at least one 2-cycle is $\zeta + o(1)$ for some constant $\zeta = \zeta(c) < 1$.

$H$ consists of a specific digraph $H_1$, defined below, and a pair of vertices $u_1, u_2$, where the
edges $u_1 u_2$ and $u_2 u_1$ are both present. $H_1$ has the following properties:

(i): every unicyclic digraph which does not contain a 2-cycle a.s. has a homomorphism to $H_1$,
and

(ii): $D_{n,p=c_1/n}$ is not a.s. non-$H_1$-homomorphic for some $c_1 > 1/2$.

It is easy to see that any unicyclic digraph whose cycle is a 2-cycle has a homomorphism
to the 2-cycle. By (i), every other unicyclic digraph has a homomorphism to $H_1$. Thus,
every unicyclic digraph has an $H$-homomorphism, as claimed. We will show that, for every
$1/2 < c < c_1$, $D_{n,p=c/n}$ is neither a.s. $H$-homomorphic nor a.s. non-$H$-homomorphic. Thus, we
have a coarse threshold. Condition (ii) above shows the latter, so we just need to prove the
former.

If $c > 1/2$, then the graph $\overline{D}$, for $D = D_{n,p}$, a.s. has a giant component, as proven by Karp [?].
It is not hard to see that a.s. if $D$ has a 2-cycle in the giant component of $\overline{D}$, then there is no
$H$-homomorphism: that 2-cycle must be mapped onto $u_1, u_2$. Since $H$ has no edge joining
$\{u_1,u_2\}$ to the rest of $H$, any vertex that can be reached in $\overline{D}$ from that 2-cycle must also be
mapped onto $u_1$ or $u_2$. So the entire giant component must be mapped onto $\{u_1,u_2\}$. A.s. that
component has an odd cycle in $\overline{D}$, and that odd cycle cannot be mapped onto a 2-cycle. It is easy
to show that the probability that $D$ has a 2-cycle in its giant component is at least some positive
constant. Therefore, $D$ is not a.s. $H$-homomorphic.

It remains only to prove the existence of some $H_1$ satisfying (i), (ii). We choose $H_1$ to be a
tournament (i.e. for every pair of vertices, exactly one of the possible edges between them is
present) which contains every directed graph on $k$ vertices as a subgraph, where $k$ is a constant
defined below.

For an undirected graph $G$ which does not contain any multiple edges, the oriented chromatic
number $\chi_o$ of $G$ is the minimum number $k$ such that every directed graph $D$ satisfying $\overline{D} = G$
is homomorphic to a directed graph $H$ with at most $k$ vertices. The acyclic chromatic number
of a graph $G$ is the least integer $k$ for which there is a proper coloring of the vertices of $G$ with
$k$ colors in such a way that every cycle of $G$ contains at least 3 different colors. It was proved
in [31] that if the acyclic chromatic number of a graph $G$ is bounded by $k$, then its oriented
chromatic number is bounded by $k \cdot 2^{k-1}$. When $D = D_{n,p}$, every edge is present in $\overline{D}$ with
probability $2p - p^2$, independently of the other edges. Thus Lemma 12 below together with
the result of [31] implies that taking $k = 6 \times 2^5$ and $c_1 = c/2$, $H_1$ satisfies (ii), where $c$ is the
constant which is obtained from Lemma 12.
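
The acyclic-colouring condition and the bound quoted from [31] can be made concrete with a small Python sketch. It assumes a simple graph on vertices $0, \dots, n-1$ given as an edge list, and relies on the standard observation that a proper colouring has at least 3 colours on every cycle exactly when any two colour classes together induce a forest. The names `is_acyclic_coloring` and `oriented_bound` are illustrative only.

```python
from itertools import combinations

def _is_forest(n, edges):
    """Union-find cycle test on a simple graph with vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:          # this edge would close a cycle
            return False
        parent[ru] = rv
    return True

def is_acyclic_coloring(n, edges, color):
    """True iff `color` (a list of length n) is proper and every cycle
    of the graph receives at least 3 different colours."""
    if any(color[u] == color[v] for u, v in edges):
        return False  # not a proper colouring
    for a, b in combinations(sorted(set(color)), 2):
        two_coloured = [(u, v) for u, v in edges if {color[u], color[v]} == {a, b}]
        if not _is_forest(n, two_coloured):
            return False  # a cycle using only colours a and b exists
    return True

def oriented_bound(k):
    """Bound k * 2^(k-1) on the oriented chromatic number, as quoted from [31]."""
    return k * 2 ** (k - 1)
```

For instance, on a 4-cycle the alternating colouring `[0, 1, 0, 1]` is proper but not acyclic, while `[0, 1, 0, 2]` is acyclic.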

Lemma 12 There exists $c > 1$ such that a.s. the acyclic chromatic number of $G_{n,p=c/n}$ is at most 5.

Proof. Let $G = G_{n,p}$. A pendant path in $G$ is a path in which no vertices other than the
endpoints lie in any edge of the graph off the path. It is known that there exists $c > 1$ such
that a.s. after removing the internal vertices of pendant paths of length at least 4 from $G$, every
component is either a tree or unicyclic. One can use 3 colors to color the vertices in these
components and then use 2 other colors to color the removed vertices such that every cycle in
$G$ is colored by at least 3 colors. ■



4 The (d, k, t)-model

Proof of Theorem 2 Suppose that the $(d, k, t)$-model exhibits a coarse threshold. Then
consider $p$, $\alpha$, $\epsilon$ and $M$ as guaranteed by Corollary 6. It is easy to verify that, since $t < d^{k-1}$
and $M$ is unicyclic, $M$ is satisfiable. Suppose that $V(M) = u_1, \dots, u_r$ and let $a_i$ be the value of
$u_i$ in some particular satisfying assignment $A$ of $M$. Given a CSP $F$ on at least $r$ variables, we
define $F \oplus A$ to be the CSP formed by choosing a random ordered $r$-tuple of variables $v_1, \dots, v_r$
in $F$ and for each $1 \le i \le r$, forcing $v_i$ to take the value $a_i$ by adding a one-variable constraint
on $v_i$. Clearly the probability that $CSP_{n,p}(\mathcal{P}) \oplus A$ is unsatisfiable is at least as high as the
probability that $CSP_{n,p}(\mathcal{P}) \oplus M$ is unsatisfiable.

Lemma 7 implies that there exists some constant $L$ such that, defining $E_1$ to be the event
that every $v_i$ has at most $L$ neighbours, $\Pr(E_1) > 1 - \frac{\alpha}{2}$. Suppose that $E_1$ holds. Consider
a particular $v_i$ and expose the $\ell \le L$ edges containing it, $e_1, \dots, e_\ell$, and the corresponding
constraints $C_1, \dots, C_\ell$. For each $C_j$, let $C'_j$ be the $(k-1)$-variable constraint obtained by restricting
$v_i$ to be $a_i$; i.e. a $(k-1)$-tuple of values is permitted for $C'_j$ iff $C_j$ permits that same $(k-1)$-tuple
along with $v_i = a_i$. Since $C_j$ has at most $t$ restrictions, $C'_j$ has at most $t$ restrictions.

Let $G$ be a random CSP formed as follows: start with a random $CSP_{n,p}(\mathcal{P})$ and then for each
of the $\binom{d^{k-1}}{t}$ possible constraints on $k-1$ variables and with $t$ restrictions, choose $rL$ random
ordered $(k-1)$-tuples of variables and place that constraint on them. The probability that $G$
is unsatisfiable is at least as high as the probability that $CSP_{n,p}(\mathcal{P}) \oplus A$ is unsatisfiable, as each
batch of $L$ copies of every $(d, k-1, t)$-constraint is at least as restrictive as forcing $v_i = a_i$. Thus,
adding those $\binom{d^{k-1}}{t} rL$ constraints boosts the probability of unsatisfiability by at least $2\alpha$. We say
that a canonical set of $\binom{d^{k-1}}{t} rL$ ordered $(k-1)$-tuples is bad if adding the constraints to that set
results in an unsatisfiable CSP. So, consider the following random experiment: pick a random
$CSP_{n,p}(\mathcal{P})$ and then pick $\binom{d^{k-1}}{t} rL$ ordered $(k-1)$-tuples of the variables. The probability that
we pick a bad set is at least $2\alpha$. Since $\binom{d^{k-1}}{t} rL = O(1)$, a simple first moment calculation shows
that a.s. the choice of $(k-1)$-tuples will be vertex-disjoint. Thus, the probability of picking a
bad set is at least $2\alpha - o(1)$ even if we condition on the $(k-1)$-tuples being vertex-disjoint.

Define $T$ by: (i) taking the hypergraph consisting of a vertex $v$ lying in $\binom{d^{k}}{t} rL$ edges where
no other vertex lies in more than one of the edges, and (ii) placing each of the $\binom{d^{k}}{t}$ possible
$(d, k, t)$-constraints on $rL$ of the edges. For each $1 \le \delta \le d$, let $T_\delta$ denote the collection
of $(k-1)$-tuples obtained by removing $v$ from every edge containing a constraint in which
every restriction has $v = \delta$; note that $|T_\delta| = \binom{d^{k-1}}{t} rL$. By Lemma 8, there is some $z =
o(n^{1-k})$ such that $CSP_{n,p=z}(\mathcal{P})$ a.s. contains a copy of $T$. Consider adding that copy of $T$ to
$CSP_{n,p}(\mathcal{P})$. The probability that for each $1 \le \delta \le d$, $T_\delta$ is a bad set is at least $(2\alpha - o(1))^d$.
Note that if every $T_\delta$ is a bad set, then the resulting CSP is unsatisfiable because setting
$v = \delta$ requires the set of $(k-1)$-constraints on $T_\delta$ to be enforced. Thus, the probability that
$CSP_{n,p+z}(\mathcal{P})$ is unsatisfiable is at least $(2\alpha - o(1))^d$. By considering adding $x$ copies of $T$,
we see that the probability that $CSP_{n,p+xz}(\mathcal{P})$ is satisfiable is at most $(1 - (2\alpha - o(1))^d)^x$,
which is less than $\alpha$ for some sufficiently large constant $x$. Since $z = o(n^{1-k})$, this implies that
$\Pr(CSP_{n,(1+\epsilon)p}(\mathcal{P})$ is unsatisfiable$) > 1 - \alpha$, which contradicts Corollary 6(b). □
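
To illustrate the final step of this proof, the following Python sketch computes how many copies $x$ of $T$ suffice for $(1 - (2\alpha)^d)^x < \alpha$. It drops the $o(1)$ term, so it shows only that $x$ is a constant depending on $\alpha$ and $d$; the function name `copies_needed` is hypothetical.

```python
def copies_needed(alpha: float, d: int) -> int:
    """Smallest positive integer x with (1 - (2*alpha)**d)**x < alpha.

    Sketch only: the o(1) term from the proof is ignored, and we assume
    0 < alpha < 1/2 so that 0 < (2*alpha)**d < 1.
    """
    if not (0.0 < alpha < 0.5) or d < 1:
        raise ValueError("need 0 < alpha < 1/2 and d >= 1")
    q = (2.0 * alpha) ** d      # lower bound on Pr(all T_delta are bad)
    x = 1
    while (1.0 - q) ** x >= alpha:
        x += 1
    return x
```

The result does not depend on $n$, which is what allows $xz = o(n^{1-k})$ to be absorbed into the increase from $p$ to $(1+\epsilon)p$.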

5 Binary CSP's with domain size 3 

Recall that Istrate [23] (see also Creignou and Daudé [11]) has proven that when the domain size
$d = 2$, then Hypothesis A holds; i.e. if every unicyclic CSP is satisfiable, then $CSP_{n,p}(\mathcal{P})$ has
a sharp threshold. This result does not extend to $d = 3$. Consider the following example, with
$d = 3$, $k = 2$:

Example 13 We have two constraints. $C_1$ says that either both variables are equal to 1, or
neither is equal to 1. $C_2$ says that the variables cannot both have the same value.
$\mathcal{P}(C_1) = \frac{2}{3}$, $\mathcal{P}(C_2) = \frac{1}{3}$.

Observe that every unicyclic CSP that uses only constraints $C_1, C_2$ is satisfiable.

Consider any $\frac{3}{2} < c < 3$. Thus, a.s. the sub-CSP formed by the $C_1$ constraints has a
giant component, and the sub-CSP formed by the $C_2$ constraints does not. We will show that
$CSP_{n,p}(\mathcal{P})$ is neither a.s. satisfiable nor a.s. unsatisfiable.

To see that it is not a.s. unsatisfiable, note that the subgraph induced by the $C_2$ constraints
is 2-colourable with probability at least some positive constant. This follows from the well
known fact that for $c < 1$ the random graph $G(n, \frac{c}{n})$ is 2-colourable with probability at
least some positive constant. If it is 2-colourable, then we can satisfy all the $C_2$ constraints by
assigning every variable either 2 or 3; this will not violate any $C_1$ constraints.

To see that it is not a.s. satisfiable, note that the subgraph formed by the $C_1$ constraints has a
giant component $T$. So either every variable in $T$ is assigned 1 or none of them are. A.s. at
least one $C_2$ constraint has both variables in $T$, and so they cannot both be assigned 1. Thus,
a.s. no variables in $T$ can be assigned 1. This implies that if the $C_2$ constraints form an odd
cycle using variables of $T$ then the CSP is not satisfiable; that event occurs with probability at
least some positive constant.
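
The mechanism in Example 13 can be checked by brute force on tiny instances. The Python sketch below encodes $C_1$ and $C_2$ over the domain $\{1,2,3\}$ and tests satisfiability exhaustively; the helper names and the specific small instances are ours and serve only as illustration.

```python
from itertools import product

def c1_ok(a, b):
    """C_1: either both variables equal 1, or neither does."""
    return (a == 1) == (b == 1)

def c2_ok(a, b):
    """C_2: the two variables may not take the same value."""
    return a != b

def satisfiable(n, c1_edges, c2_edges):
    """Exhaustively test all 3^n assignments of values {1,2,3}."""
    for assign in product((1, 2, 3), repeat=n):
        if all(c1_ok(assign[u], assign[v]) for u, v in c1_edges) and \
           all(c2_ok(assign[u], assign[v]) for u, v in c2_edges):
            return True
    return False
```

A $C_2$ triangle alone is unicyclic and satisfiable (use all three values), but once its three variables are tied together by $C_1$ constraints through a common neighbour, value 1 is excluded and the odd cycle cannot be 2-coloured with $\{2,3\}$. An even $C_2$ cycle tied together in the same way remains satisfiable.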

The main result of this section is that when $d = 3$ and $k = 2$, if Hypothesis A fails on some
model, then it has to fail for the same reason as it failed for Example 13.

Consider a CSP $F$ where every constraint is on 2 variables. Suppose there is some constraint
on variables $v,u$ which implies that if $v$ is assigned $\delta$ then $u$ must be assigned $\gamma$; we say that
$v : \delta$ forces $u : \gamma$ and denote this by $v : \delta \to u : \gamma$. Moreover, if there is a sequence of variables
$v_1, \dots, v_r$ and values $\delta_1, \dots, \delta_r$ such that $v_i : \delta_i \to v_{i+1} : \delta_{i+1}$ for $i = 1, \dots, r-1$, then we say
that $v_1 : \delta_1$ forces $v_r : \delta_r$.

Theorem 14 Consider some $\mathcal{P}$ with $d = 3$, $k = 2$ such that every unicyclic CSP formed from
$\mathrm{supp}(\mathcal{P})$ is satisfiable. $CSP_{n,p}(\mathcal{P})$ has a coarse threshold iff there exists a unicyclic CSP $M$
formed from $\mathrm{supp}(\mathcal{P})$, a value $1 \le \delta \le 3$, $p = p(n)$, $\epsilon > 0$, $z > 0$, $b > 0$ such that:

(a) $\epsilon < \Pr(CSP_{n,p}(\mathcal{P})$ is satisfiable$) < 1 - \epsilon$;

(b) $M$ cannot be satisfied using only the two values other than $\delta$;

(c) $CSP_{n,p}(\mathcal{P})$ a.s. has at least $zn$ variables $v$ such that $v : \delta \to u : \delta$ for at least $bn$ variables
$u$.

Thus, this explains the only ways that a model with $d = 3$, $k = 2$ can have a coarse threshold.
We remark that this theorem does not extend to $d = 4$, $k = 2$ nor $d = 3$, $k = 3$; in both cases
there are other causes for a coarse threshold.
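
The forcing relation $v : \delta \to u : \gamma$ used in Theorem 14 can be explored by a breadth-first search over (variable, value) pairs. The Python sketch below is one way to do this under our own assumptions: each binary constraint is stored as a set of forbidden value pairs (its restrictions), and $v : \delta$ is taken to force $u : \gamma$ through a constraint when $\gamma$ is the only value for $u$ that the constraint permits alongside $v = \delta$. The name `forced_pairs` and the data layout are hypothetical.

```python
from collections import deque

def forced_pairs(constraints, start_var, start_val, domain=(1, 2, 3)):
    """Return all (variable, value) pairs forced by start_var : start_val.

    constraints: dict mapping an ordered pair (v, u) to a set of
    forbidden (value_of_v, value_of_u) pairs.
    """
    # View every constraint from both of its endpoints.
    nbrs = {}
    for (v, u), forbidden in constraints.items():
        nbrs.setdefault(v, []).append((u, set(forbidden)))
        nbrs.setdefault(u, []).append((v, {(b, a) for (a, b) in forbidden}))

    start = (start_var, start_val)
    forced = set()
    queue = deque([start])
    while queue:
        v, val = queue.popleft()
        for u, forbidden in nbrs.get(v, []):
            allowed = [g for g in domain if (val, g) not in forbidden]
            if len(allowed) == 1:          # exactly one value left: forced
                pair = (u, allowed[0])
                if pair != start and pair not in forced:
                    forced.add(pair)
                    queue.append(pair)
    return forced
```

With the constraint $C_1$ of Example 13 placed along a path, assigning 1 to one endpoint forces 1 on every other variable of the path, whereas assigning 2 forces nothing.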

Proof We leave it to the reader to verify, using similar reasoning to that for Example 13, that
conditions (a), (b), (c) imply a coarse threshold. The only difference here is that the set of variables
$u$ in (c) could change for different choices of $v$, and it is important to note that for each value $\delta$,
there is some constraint in $\mathrm{supp}(\mathcal{P})$ that forbids both variables from receiving $\delta$, as otherwise
$CSP_{n,p}(\mathcal{P})$ is trivially satisfiable by setting every variable equal to $\delta$. We will give a proof of
the other direction. The case where some domain values are bad (as defined in [27]) is easily
disposed of, so we assume that there are no such values.

For each variable $v$ and each $1 \le \delta \le 3$, $1 \le \gamma \le 3$ we define $F_{\delta,\gamma}(v)$ to be the set of variables
$u$ such that $v : \delta \to u : \gamma$, and we define $F_\delta(v) = \cup_{1 \le \gamma \le 3} F_{\delta,\gamma}(v)$. We can expose $F_\delta(v)$ by using
a simple breadth-first search from $v$. This allows us to analyze the distribution of the size of
$F_\delta(v)$ and $F_{\delta,\gamma}(v)$ using a standard branching-process analysis (see e.g. Chapter 5 of [24]). We
say that $F_{\delta,\gamma}$ percolates if there are constants $\zeta, \beta > 0$ such that $\Pr(|F_{\delta,\gamma}(v)| > \beta n) > \zeta$. It is
straightforward to prove:

Claim 1: If $F_{\delta,\gamma}$ does not percolate, then for every $\xi > 0$ there is a constant $L$ such that
$\Pr(|F_{\delta,\gamma}(v)| < L) > 1 - \xi$.

Claim 2: If $F_{\delta,\gamma}$ percolates, then there are constants $z, b > 0$ such that a.s. there are at least
$zn$ variables $v$ with $|F_{\delta,\gamma}(v)| > bn$.

Claim 2 yields that $F_{\delta,\delta}$ percolates for at most one value $\delta$: suppose that $F_{\delta,\delta}$ and $F_{\gamma,\gamma}$ both
percolate. We want to show that in this case Corollary 6(b) fails. To this end, we obtain an
instance of $CSP_{n,(1+\epsilon)p}(\mathcal{P})$ as follows. First, we consider $P_0$, an instance of $CSP_{n,p}(\mathcal{P})$. Then
$P_i$ is obtained by taking the union of $P_{i-1}$ and an instance of $CSP_{n,\epsilon p/3}(\mathcal{P})$, for $i = 1, 2, 3$.

Let $u$, $v$, and $w$ be variables such that in $P_0$, $u : \delta$ forces both $v : \delta$ and $w : \delta$. If the restriction
$(\delta, \delta)$ is added on $v$ and $w$, then $u$ cannot be assigned $\delta$. Since there are constraints in $\mathrm{supp}(\mathcal{P})$
with the restriction $(\delta, \delta)$ and constraints with the restriction $(\gamma, \gamma)$ (see [27]), we can conclude
that in $P_1$, a.s. there are two sets of variables of size $\Theta(n)$, $A$ and $B$, such that $\delta$ cannot be
assigned to any variable in $A$, and $\gamma$ cannot be assigned to any variable in $B$.

Since $F_{\delta,\delta}$ percolates, there is a value $\sigma$ and a constraint in $\mathrm{supp}(\mathcal{P})$ such that if that
constraint is applied on $v_1, v_2$, then $v_1 : \sigma \to v_2 : \delta$. Let $u$ be a variable in $A$, and $v$ be an arbitrary
variable. If there is a constraint on $u$ and $v$ which justifies that $v : \sigma \to u : \delta$, then $\sigma$ cannot be
assigned to $v$. Now we can conclude that in $P_2$, almost surely, there is a set $C$ of size $\Theta(n)$ of
variables such that $\sigma$ cannot be assigned to any variable in $C$ for the reason mentioned above.
Every variable in $A$ or $B$ is in $C$ with a positive probability, independently of the other
variables. So $C_1 = C \cap A$ and $C_2 = C \cap B$ are both of size $\Theta(n)$, almost surely. Without loss of
generality suppose that $\delta \ne \sigma$ (otherwise we would assume $\gamma \ne \sigma$). Neither of the values $\delta$ and $\sigma$
can be assigned to any variable in $C_1$. So there is only one value which can be assigned to
these variables. But then in $P_3$ a.s. there is a constraint on two variables in $C_1$ which forbids
them to be assigned this value simultaneously. So $CSP_{n,(1+\epsilon)p}(\mathcal{P})$ is a.s. not satisfiable, which
contradicts Corollary 6(b).

It is less straightforward, but not very difficult, to prove:

Claim 3: If $F_{\delta,\gamma}$ percolates for any pair $\delta, \gamma$ then either (i) $F_{\delta,\delta}$ percolates or (ii) there is some
$\mu$ such that $F_{\mu,\mu}$ percolates and there is a sequence of constraints in $\mathrm{supp}(\mathcal{P})$ through which
$v : \delta \to u : \mu$.

Suppose that $CSP_{n,p}(\mathcal{P})$ has a coarse threshold and consider $M$, $\epsilon$, $\alpha$, $p = p(n)$ from
Corollary 6.

Claim 4: There is a value $\delta$ such that (i) every satisfying assignment of $M$ must use $\delta$
on at least one variable and (ii) $F_{\delta,\gamma}$ percolates for at least one value $\gamma$.

If $F_{\delta,\delta}$ percolates then this satisfies Theorem 14. Otherwise, applying Claim 3, $F_{\mu,\mu}$ percolates
and there is a sequence of constraints so that $v : \delta \to u : \mu$. Attaching that sequence to every
variable of $M$ yields a unicyclic CSP $M'$ for which every satisfying assignment must use $\mu$ on
at least one variable. Thus $M', \mu$ satisfy Theorem 14.

Suppose that Claim 4 does not hold, and consider any satisfying assignment $A$ of $M$ in
which every value $\delta$ used is such that $F_{\delta,\delta}$ does not percolate. Suppose that $M$ has $r$ variables
$x_1, \dots, x_r$ and that $A$ assigns $a_i$ to $x_i$. Recall from Section 4 that $CSP_{n,p}(\mathcal{P}) \oplus A$ is
formed by taking $CSP_{n,p}(\mathcal{P})$ and then choosing $r$ random variables $v_1, \dots, v_r$ and adding
one-variable constraints that force $v_i$ to take $a_i$. Clearly $\Pr(CSP_{n,p}(\mathcal{P}) \oplus A$ is unsatisfiable$) \ge
\Pr(CSP_{n,p}(\mathcal{P}) \oplus M$ is unsatisfiable$)$.

Expose $F = \cup_{i=1}^{r} F_{a_i}(v_i)$, and $U$, the set of variables outside of $F$ that lie in a constraint
with a variable in $F$. Since none of the $F_{a_i}$ percolate, Claim 1 allows us to show that there is
some $L$ such that with probability at least $1 - \alpha/2$, $|U| < L$. Since adding $M$ to $CSP_{n,p}(\mathcal{P})$
increases the probability of unsatisfiability by at least $2\alpha$, it must be that the probability that
$CSP_{n,p}(\mathcal{P})$ is satisfiable, $|U| < L$ and $CSP_{n,p}(\mathcal{P}) \oplus A$ is unsatisfiable is at least $3\alpha/2$.

Suppose that $CSP_{n,p}(\mathcal{P})$ is satisfiable and $|U| < L$. Consider some $u \in U$ sharing a constraint
with $w \in F$ where $A$ forces $w$ to take the value $\mu$. Let $\Omega = \Omega(u)$ be the set of values which can
be assigned to $u$ which, in conjunction with assigning $\mu$ to $w$, do not violate their constraint.
We know that $|\Omega| \ne 0$ since otherwise $\mu$ is a bad value. We know that $|\Omega| \ne 1$ since otherwise
$u \in F$. So $|\Omega(u)| \ge 2$ for each $u \in U$. Suppose that $u_1, \dots, u_\ell$ are the variables in $U$ with
$|\Omega| = 2$, and let $\delta_i$ be the value not in $\Omega(u_i)$. Consider taking a random CSP formed as follows:
first take a $CSP_{n,p}(\mathcal{P})$ and then choose $\ell$ random variables $u_1, \dots, u_\ell$ and force $u_i$ to not take
value $\delta_i$ using a one-variable constraint. What we have proved is that the one-variable constraints
boost the probability of unsatisfiability by $3\alpha/2$.

At this point, we can complete the proof using the argument from the Achlioptas-Friedgut
proof [3] that $d$-colourability has a sharp threshold for $d \ge 3$. In that paper, the starting point
was to note that since every constraint is a colourability constraint, fixing an assignment on
$M$ at worst forbids one colour from each neighbour of $M$. This put them in the same position
that we are in now, and so the rest of our proof is the same as theirs. □

We close this section by noting why this proof cannot be extended to general $d$. The problem
is that possibly some of the variables in $U$ would have their domain sizes reduced by two instead
of one. The argument in [3] cannot handle that possibility.

6 Future Directions 

There is clearly much work still to be done along these lines of research. The big problem
still remains: determine precisely which models from [27] have a sharp threshold. Of course,
Section 2 indicates that this may be overly ambitious. But lowering our sights only slightly, we
can try to determine all possible causes for coarse thresholds, i.e. continue the course started
in Section 5. An important subgoal would be to do this for binary CSP's, i.e. the case where
$k = 2$. Another reasonable goal to pursue would be to cover the $d = 3$ case.

As far as more specific classes of models go, one should try to extend the work in Section 3
and examine whether Hypothesis A holds for $H$-homomorphism problems when $H$ is a directed
hypergraph. Such homomorphism problems are equivalent to CSP's in which every constraint
is identical under some permutation of the variables. And of course, it would be good to
determine whether the "connected" condition can be removed from Theorem 4.